A multi-row deletion diagnostic for influential observations in small-sample regressions

Authors

  • Daniel T. Kaffine
  • Graham A. Davis
Abstract

Policy makers often look to economic research for guidance. Econometric studies examining actual economic outcomes for countries under different historical policy regimes are particularly valuable. These studies are sometimes performed on small (N ≈ 100) data sets. A particular concern when working with these data sets is the potential for the inference from the regression to be sensitive to a few influential data points. For context, consider the case of growth regressions. Growth regressions have been criticized on many levels, one of which is their instability as the sample is altered in small ways. In many cases, small changes in sample lead to a reversal of the economic inference from the modeling exercise. In one example, a well-cited study finding large benefits of investment on economic growth was shown to hinge on the inclusion of Botswana in the sample. Another study’s finding that high public debt to GDP ratios cause measurably slower economic growth, an outcome that heavily influenced political platforms in elections in the United States and Great Britain, was later shown to hold only if New Zealand is included in the sample. Despite this, the sensitivity of economic inference to sample has received little systematic attention and is rarely rigorously explored by growth researchers. The authors’ premise is that researchers and policy makers are interested in knowing when the policy inference from a regression analysis is driven by a single observation or small group of observations, regardless of what the formal regression statistics say for their baseline regression. For example, for small changes in sample, do the signs on the coefficients of interest change? Does the magnitude of a coefficient move from being meaningful to irrelevant? Does the t-statistic change from signaling statistical significance to signaling statistical insignificance? 
In this paper the authors present a multi-row deletion analysis (MRDA) data analytics approach to systematically test small-sample regression results for the presence of influential points or groups of influential points. The approach is complementary to two other approaches often used in such diagnostics: DFBETAS and robust regression. Using both simulated and real data, the authors show that MRDA provides insights into sample sensitivity that these other approaches miss. In their real data analysis, they take a second look at the data used in a well-cited paper examining the impact of institutional quality on the resource curse. That paper suggested that resource-rich countries with good enough institutional quality had accelerated growth as a result of their resource endowment, while countries with poor institutional quality had slower growth. The World Bank makes frequent note of this result when it discusses developing institutional capacity in resource-rich developing nations.
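The general idea of a row-deletion check can be illustrated with a minimal sketch. This is not the authors' MRDA procedure, whose details the abstract does not give; it is an assumption-laden illustration that refits a simple one-regressor OLS model after deleting every subset of up to `max_deleted` rows and flags subsets whose removal flips the sign of the slope or moves its t-statistic across a threshold. The function names `ols_slope` and `deletion_scan`, the default threshold `t_crit=2.0`, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import copysign, sqrt


def ols_slope(x, y):
    """Return (slope, t_statistic) for the simple regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)
    # Guard against a perfect fit, where the standard error is zero.
    t_stat = slope / se if se > 0 else copysign(float("inf"), slope)
    return slope, t_stat


def deletion_scan(x, y, max_deleted=2, t_crit=2.0):
    """Refit after deleting every subset of 1..max_deleted rows.

    Returns the baseline (slope, t) and a list of flagged deletions, each a
    tuple (dropped_rows, slope, t, sign_flip, significance_change).
    t_crit is an assumed rule-of-thumb cutoff, not an exact critical value.
    """
    base_slope, base_t = ols_slope(x, y)
    base_sig = abs(base_t) >= t_crit
    rows = range(len(x))
    flagged = []
    for k in range(1, max_deleted + 1):
        for drop in combinations(rows, k):
            keep = [i for i in rows if i not in drop]
            slope, t = ols_slope([x[i] for i in keep], [y[i] for i in keep])
            sign_flip = (slope > 0) != (base_slope > 0)
            sig_change = (abs(t) >= t_crit) != base_sig
            if sign_flip or sig_change:
                flagged.append((drop, slope, t, sign_flip, sig_change))
    return (base_slope, base_t), flagged


if __name__ == "__main__":
    # Toy data (made up): the last row is a high-leverage point.
    x = [1, 2, 3, 4, 5, 6, 7, 8, 20]
    y = [5.0, 4.6, 4.9, 4.2, 4.4, 3.9, 4.1, 3.6, 12.0]
    (slope, t), flagged = deletion_scan(x, y, max_deleted=1)
    print(f"baseline slope={slope:.3f}, t={t:.2f}")
    for drop, s, t_new, sign_flip, sig_change in flagged:
        print(drop, f"slope={s:.3f}", f"t={t_new:.2f}", sign_flip, sig_change)
```

Enumerating all subsets grows combinatorially with `max_deleted`, so a brute-force scan like this is only practical for the small samples and small deletion sizes the abstract has in mind.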

Similar resources

Detection of Outliers and Influential Observations in Linear Ridge Measurement Error Models with Stochastic Linear Restrictions

The aim of this paper is to propose some diagnostic methods in linear ridge measurement error models with stochastic linear restrictions using the corrected likelihood. Based on the bias-corrected estimation of model parameters, diagnostic measures are developed to identify outlying and influential observations. In addition, we derive the corrected score test statistic for outliers detection ba...

Influence Measures in Ridge Linear Measurement Error Models

Usually the existence of influential observations is complicated by the presence of collinearity in linear measurement error models. However, no influence measure is available for the possible effects that collinearity can have on the influence of an observation in such models. In this paper, a new type of ridge estimator based corrected likelihood function (REC) for linear measurement e...

Diagnostic Measures in Ridge Regression Model with AR(1) Errors under the Stochastic Linear Restrictions

Outliers and influential observations have important effects on regression analysis. The goal of this paper is to extend the mean-shift model for detecting outliers to the ridge regression model in the presence of stochastic linear restrictions when the error terms follow an autoregressive AR(1) process. Furthermore, extensions of measures for diagnosing influential observations are ...

Influence Diagnostics in Two-Parameter Ridge Regression

Identifying influential observations is an important part of the model building process in linear regression. There are numerous diagnostic measures based on different approaches in linear regression analysis. However, the problems of multicollinearity and influential observations may occur simultaneously. Therefore, we propose new diagnostic measures based on the two parameter ridge e...

Case-deletion diagnostics for maximum likelihood multipoint quantitative trait locus linkage analysis.

OBJECTIVES: Case-deletion diagnostic methods are tools that allow identification of influential observations that may affect parameter estimates and model fitting conclusions. The goal of this paper was to develop two case-deletion diagnostics, the exact case deletion (ECD) and the empirical influence function (EIF), for detecting outliers that can affect results of sib-pair maximum likelihood q...

Journal:
  • Computational Statistics & Data Analysis

Volume 108

Publication date: 2017